KSR 1 Multiprocessor: Analysis of Latency Hiding Techniques in a Sparse Solver
Abstract
This paper analyzes and evaluates some novel latency hiding features of the KSR1 multiprocessor: prefetch and poststore instructions and automatic updates. As a case study, we analyze the performance of an iterative sparse solver which generates irregular communications. We show that automatic updates significantly reduce the amount of communication. Although prefetch and poststore instructions reduce the coherence miss ratios, they do not significantly improve the sparse solver performance due to the overhead in executing these instructions.
Similar resources
Exploiting Communication Latency Hiding for Parallel Network Computing: Model and Analysis
Very large problems with high resource requirements of both computation and communication could be tackled with large numbers of workstations. However, for LAN-based networks, contention becomes a limiting factor, whereas latency appears to limit communication for WAN-based networks, nominally the Internet. In this paper, we describe a model to analyze the gain of communication latency hiding by ...
Data and Program Restructuring of Irregular Applications
Applications with irregular data structures such as sparse matrices or finite element meshes account for a large fraction of engineering and scientific applications. Domain decomposition techniques are commonly used to partition these applications to reduce interprocessor communication on message passing parallel systems. Our work investigates the use of domain decomposition techniques on cache-co...
Performance of Grace Hash Join Algorithm on the Ksr-1 Multiprocessor: Evaluation and Analysis
In relational database systems, the join is one of the most expensive but fundamental query operations. Among various join methods, the hash-based join algorithms show great potential as they lend themselves to parallelization. Although performance of the hash join algorithm has been evaluated for many architectures, to the best of our knowledge, it has not been evaluated for the COMA memory a...
Analysis of Memory Latency Factors and their Impact on KSR 1 MPP
The Kendall Square Research KSR1 MPP system has a shared address space, which spreads over physically distributed memory modules. Thus, memory access time can vary over a wide range even when accessing the same variable, depending on how this variable is being referenced and updated by the various processors. Since the processor stalls during this access time, the KSR1 performance depends consi...
Parallel sparse LU factorization on different message passing platforms
Several message passing-based parallel solvers have been developed for general (nonsymmetric) sparse LU factorization with partial pivoting. Existing solvers were mostly deployed and evaluated on parallel computing platforms with high message passing performance (e.g., 1–10 μs in message latency and 100–1000 Mbytes/sec in message throughput) while little attention has been paid to slower platfo...